Agents, Bookmarks and Clicks: A topical model of Web traffic
Analysis of aggregate and individual Web traffic has shown that PageRank is a
poor model of how people navigate the Web. Using the empirical traffic patterns
generated by a thousand users, we characterize several properties of Web
traffic that cannot be reproduced by Markovian models. We examine both
aggregate statistics capturing collective behavior, such as page and link
traffic, and individual statistics, such as entropy and session size. No model
currently explains all of these empirical observations simultaneously. We show
that all of these traffic patterns can be explained by an agent-based model
that takes into account several realistic browsing behaviors. First, agents
maintain individual lists of bookmarks (a non-Markovian memory mechanism) that
are used as teleportation targets. Second, agents can retreat along visited
links, a branching mechanism that also allows us to reproduce behaviors such as
the use of a back button and tabbed browsing. Finally, agents are sustained by
visiting novel pages of topical interest, with adjacent pages being more
topically related to each other than distant ones. This modulates the
probability that an agent continues to browse or starts a new session, allowing
us to recreate heterogeneous session lengths. The resulting model is capable of
reproducing the collective and individual behaviors we observe in the empirical
data, reconciling the narrowly focused browsing patterns of individual users
with the extreme heterogeneity of aggregate traffic measurements. This result
allows us to identify a few salient features that are necessary and sufficient
to interpret the browsing patterns observed in our data. In addition to the
descriptive and explanatory power of such a model, our results may lead the way
to more sophisticated, realistic, and effective ranking and crawling
algorithms.
Comment: 10 pages, 16 figures, 1 table. Long version of paper to appear in
Proceedings of the 21st ACM Conference on Hypertext and Hypermedia
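The three mechanisms named in the abstract above (bookmarks as teleportation targets, retreating along visited links, and topical interest sustaining a session) can be illustrated with a toy simulation. This is a minimal sketch under invented assumptions, not the authors' model: the test graph, the parameters `p_back`, `energy0`, `cost`, and `reward`, the uniform choice among bookmarks, and all function names are hypothetical.

```python
import random

def ring_graph(n, shortcut=7):
    """Hypothetical test graph: each page links to its neighbor and one shortcut."""
    return {i: [(i + 1) % n, (i + shortcut) % n] for i in range(n)}

def simulate_agent(graph, topic, interest, start, n_clicks=2000,
                   p_back=0.2, energy0=3.0, cost=1.0, reward=1.5, seed=0):
    """Toy browsing agent; every parameter value is an illustrative assumption."""
    rng = random.Random(seed)
    bookmarks = [start]      # non-Markovian memory: pages the agent has seen
    seen = {start}
    history = [start]        # stack of pages visited in the current session
    visits = {start: 1}      # page -> number of visits (aggregate traffic)
    sessions = []            # clicks per completed session
    energy, length = energy0, 0
    for _ in range(n_clicks):
        if energy <= 0:
            # Out of energy: close the session and teleport to a bookmark.
            sessions.append(length)
            page = rng.choice(bookmarks)
            history, energy, length = [page], energy0, 0
        elif len(history) > 1 and rng.random() < p_back:
            # Back button: retreat along the link just followed.
            history.pop()
            page = history[-1]
            energy -= cost
            length += 1
        else:
            # Follow a random out-link from the current page.
            page = rng.choice(graph[history[-1]])
            history.append(page)
            energy -= cost
            length += 1
            if page not in seen:
                seen.add(page)
                bookmarks.append(page)
                # Novel pages on the agent's topic of interest replenish energy.
                if topic[page] == interest:
                    energy += reward
        visits[page] = visits.get(page, 0) + 1
    return visits, sessions
```

For example, `simulate_agent(ring_graph(50), {i: i // 10 for i in range(50)}, interest=0, start=0)` returns per-page visit counts and a list of session lengths. Because the energy reward depends on novelty and topic, session lengths vary from one session to the next, which is the qualitative effect the abstract attributes to this mechanism.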
Detecting and Tracking the Spread of Astroturf Memes in Microblog Streams
Online social media are complementing and in some cases replacing
person-to-person social interaction and redefining the diffusion of
information. In particular, microblogs have become crucial grounds on which
public relations, marketing, and political battles are fought. We introduce an
extensible framework that will enable the real-time analysis of meme diffusion
in social media by mining, visualizing, mapping, classifying, and modeling
massive streams of public microblogging events. We describe a Web service that
leverages this framework to track political memes in Twitter and help detect
astroturfing, smear campaigns, and other misinformation in the context of U.S.
political elections. We present some cases of abusive behaviors uncovered by
our service. Finally, we discuss promising preliminary results on the detection
of suspicious memes via supervised learning based on features extracted from
the topology of the diffusion networks, sentiment analysis, and crowdsourced
annotations.
Long-term real-world experience with ipilimumab and non-ipilimumab therapies in advanced melanoma: the IMAGE study.
Funder: This work was supported by Bristol Myers Squibb (no grant number is applicable).
BACKGROUND: Ipilimumab has shown long-term overall survival (OS) in patients with advanced melanoma in clinical trials, but robust real-world evidence is lacking. We present long-term outcomes from the IMAGE study (NCT01511913) in patients receiving ipilimumab and/or non-ipilimumab (any approved treatment other than ipilimumab) systemic therapies.
METHODS: IMAGE was a multinational, prospective, observational study assessing adult patients with advanced melanoma treated with ipilimumab or non-ipilimumab systemic therapies between June 2012 and March 2015 with ≥ 3 years of follow-up. Adjusted OS curves based on multivariate Cox regression models included covariate effects. Safety and patient-reported outcomes were assessed.
RESULTS: Among 1356 patients, 1094 (81%) received ipilimumab and 262 (19%) received non-ipilimumab index therapy (systemic therapy [chemotherapy, anti-programmed death 1 antibodies, or BRAF ± MEK inhibitors], radiotherapy, and radiosurgery). In the overall population, median age was 64 years, 60% were male, 78% were from Europe, and 78% had received previous treatment for advanced melanoma. In the ipilimumab-treated cohort, 780 (71%) patients did not receive subsequent therapy (IPI-noOther) and 314 (29%) received subsequent non-ipilimumab therapy (IPI-Other) on study. In the non-ipilimumab-treated cohort, 205 (78%) patients remained on or received other subsequent non-ipilimumab therapy (Other-Other) and 57 (22%) received subsequent ipilimumab therapy (Other-IPI) on study. Among 1151 patients who received ipilimumab at any time during the study (IPI-noOther, IPI-Other, and Other-IPI), 296 (26%) reported CTCAE grade ≥ 3 treatment-related adverse events, most occurring in year 1. Ipilimumab-treated and non-ipilimumab-treated patients who switched therapy (IPI-Other and Other-IPI) had longer OS than those who did not switch (IPI-noOther and Other-Other). Patients with prior therapy who did not switch therapy (IPI-noOther and Other-Other) showed similar OS. In treatment-naive patients, those in the IPI-noOther group tended to have longer OS than those in the Other-Other group. Patient-reported outcomes were similar between treatment cohorts.
CONCLUSIONS: With long-term follow-up (≥ 3 years), safety and OS in this real-world population of patients treated with ipilimumab 3 mg/kg were consistent with those reported in clinical trials. Patient-reported quality of life was maintained over the study period. OS analysis across both pretreated and treatment-naive patients suggested a beneficial role of ipilimumab early in treatment.
TRIAL REGISTRATION: ClinicalTrials.gov, NCT01511913. Registered January 19, 2012 - Retrospectively registered, https://clinicaltrials.gov/ct2/show/NCT01511913
A framework for analysis of anonymized network flow data
Many projects analyze application overlay networks on the Internet using packet analysis and network flow data. This is infeasible on many networks: either the volume of data makes packet inspection intractable, or privacy concerns forbid packet capture and require the dissociation of network flows from users' identities. We describe a framework for exploration of usage patterns even under circumstances where the only available data is anonymized flow records. We offer two proofs of concept using data gathered from Internet2. In the first, we uncover distributions and scaling relations in host-to-host networks with implications for capacity planning and application design. In the second, we classify network applications based on properties of their overlay networks, yielding a taxonomy that allows us to identify the functions of unknown applications.
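The idea of characterizing applications by properties of their overlay networks, using only anonymized flow records, can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout `(src_id, dst_id, dst_port, n_bytes)`, the use of the destination port as a stand-in for the application, and the particular summary statistics are all hypothetical and are not taken from the framework described above.

```python
from collections import defaultdict

def overlay_stats(flows):
    """Summarize the host-to-host overlay graph of each application.

    flows: iterable of (src_id, dst_id, dst_port, n_bytes) tuples, where the
    host ids are already anonymized (assumed record format).
    """
    edges = defaultdict(set)    # port -> set of distinct (src, dst) pairs
    volume = defaultdict(int)   # port -> total bytes transferred
    for src, dst, port, n_bytes in flows:
        edges[port].add((src, dst))
        volume[port] += n_bytes

    stats = {}
    for port, pairs in edges.items():
        hosts = {h for pair in pairs for h in pair}
        out_deg = defaultdict(int)
        for src, _ in pairs:
            out_deg[src] += 1
        stats[port] = {
            "hosts": len(hosts),                      # vertices in the overlay
            "edges": len(pairs),                      # distinct host pairs
            "max_out_degree": max(out_deg.values()),  # busiest source host
            "bytes": volume[port],
        }
    return stats
```

Feature vectors like these could then be compared across ports to group applications with similar overlay structure. No host identity is needed at any point, only consistent pseudonyms.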
Visual comparison of search results: A censorship case study
Understanding the qualitative differences between the sets of results from different search engines can be a difficult task. How many links must you follow from each list before you can reach a conclusion? We describe a user interface that allows users to quickly identify the most significant differences in content between two lists of Web pages. We have implemented this interface in CenSEARCHip, a system for comparing the effects of censorship policies on search engines.
On the Lack of Typical Behavior in the Global Web Traffic Network
We offer the first large-scale analysis of Web traffic based on network flow data. Using data collected on the Internet2 network, we constructed a weighted bipartite client-server host graph containing more than 18 × 10^6 vertices and 68 × 10^6 edges valued by relative traffic flows. When considered as a traffic map of the World-Wide Web, the generated graph provides valuable information on the statistical patterns that characterize the global information flow on the Web. Statistical analysis shows that client-server connections and traffic flows exhibit heavy-tailed probability distributions lacking any typical scale. In particular, the absence of an intrinsic average in some of the distributions implies the absence of a prototypical scale appropriate for server design, Web-centric network design, or traffic modeling. The inspection of the amount of traffic handled by clients and servers and their number of connections highlights non-trivial correlations between information flow and patterns of connectivity as well as the presence of anomalous statistical patterns related to the behavior of users on the Web. The results presented here may considerably impact the modeling, scalability analysis, and behavioral study of Web applications.
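The claim that a heavy-tailed distribution can lack an intrinsic average can be made concrete with a short sketch. It assumes synthetic Pareto-distributed samples in place of the Internet2 measurements, which are not reproduced here. The function names and the tail exponent used in the usage note are illustrative choices, not values from the paper.

```python
import random
from collections import Counter

def ccdf(values):
    """Empirical complementary CDF: list of (x, P(X >= x)) for each distinct x."""
    n = len(values)
    counts = Counter(values)
    remaining = n
    points = []
    for x in sorted(counts):
        points.append((x, remaining / n))
        remaining -= counts[x]
    return points

def sample_means(alpha, sizes, seed=0):
    """Sample mean of synthetic Pareto(alpha) draws for each sample size."""
    rng = random.Random(seed)
    means = []
    for n in sizes:
        draws = [rng.paretovariate(alpha) for _ in range(n)]
        means.append(sum(draws) / n)
    return means
```

Calling `sample_means(1.0, [10**2, 10**4, 10**6])` typically shows the sample mean growing with the sample size instead of settling, because a Pareto distribution with exponent 1 has no finite mean. Plotting the output of `ccdf` on log-log axes is a common way to inspect such tails.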